Oum el-Bouaghi Province
- North America > United States (0.46)
- Europe > Austria > Vienna (0.14)
- Europe > Italy (0.04)
- Media > Film (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Transportation > Air (0.94)
- Transportation > Infrastructure & Services > Airport (0.69)
- Europe > Austria > Vienna (0.14)
- Europe > Italy (0.04)
- Europe > United Kingdom > England (0.04)
- Transportation > Air (1.00)
- Media > Film (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Transportation > Infrastructure & Services > Airport (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
Oh, Jio, Kim, Soyeon, Seo, Junseok, Wang, Jindong, Xu, Ruochen, Xie, Xing, Whang, Steven Euijong
Large language models (LLMs) have achieved unprecedented performance in various applications, yet their evaluation remains a critical issue. Existing hallucination benchmarks are either static or lack adjustable complexity for thorough analysis. We contend that utilizing existing relational databases is a promising approach for constructing benchmarks due to their accurate knowledge description via functional dependencies. We propose ERBench to automatically convert any relational database into a benchmark based on the entity-relationship (ER) model. Our key idea is to construct questions using the database schema, records, and functional dependencies such that they can be automatically verified. In addition, we use foreign key constraints to join relations and construct multihop questions, which can be arbitrarily complex and used to debug the intermediate answers of LLMs. Finally, ERBench supports continuous evaluation, multimodal questions, and various prompt engineering techniques. In our experiments, we construct an LLM benchmark using databases of multiple domains and make an extensive comparison of contemporary LLMs. We observe that better LLMs like GPT-4 can handle a larger variety of question types, but are by no means perfect. Also, correct answers do not necessarily imply correct rationales, which is an important evaluation that ERBench does better than other benchmarks for various question types. Code is available at https://github.com/DILAB-KAIST/ERBench.
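The idea of building automatically verifiable questions from records and a functional dependency can be illustrated with a minimal sketch. This is not the ERBench implementation: the `FD` type, the `make_question` and `verify` helpers, the question template, and the two sample records are all assumptions made for illustration, and the substring check stands in for whatever verification the authors actually use.

```python
from typing import Dict, List, NamedTuple, Tuple


class FD(NamedTuple):
    """A functional dependency lhs -> rhs (hypothetical helper type)."""
    lhs: Tuple[str, ...]
    rhs: str


def make_question(record: Dict[str, object], fd: FD, template: str) -> Tuple[str, str]:
    """Fill the template with the FD's left-hand side; the right-hand side is the answer."""
    question = template.format(**{attr: record[attr] for attr in fd.lhs})
    return question, str(record[fd.rhs])


def verify(response: str, expected: str) -> bool:
    """Assumed check: the expected value appears in the model's response."""
    return expected.lower() in response.lower()


# Illustrative records only; a real benchmark would read them from a database.
movies: List[Dict[str, object]] = [
    {"title": "Inception", "year": 2010, "director": "Christopher Nolan"},
    {"title": "Titanic", "year": 1997, "director": "James Cameron"},
]

# (title, year) -> director: the left-hand side determines exactly one director.
fd = FD(lhs=("title", "year"), rhs="director")
template = "Who directed the movie {title} released in {year}?"

benchmark = [make_question(record, fd, template) for record in movies]
```

Because the left-hand side of the dependency determines a single right-hand-side value, each generated question has exactly one expected answer, which is what makes the check automatic.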
- Europe > Austria > Vienna (0.14)
- Europe > Italy (0.04)
- Europe > United Kingdom > England (0.04)
- Media > Film (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Transportation > Infrastructure & Services > Airport (0.69)
- Transportation > Air (0.69)
A New Dynamic Distributed Planning Approach: Application to DPDP Problems
In this work, we propose a new dynamic distributed planning approach that accounts for the changes an agent makes to its set of actions to be planned in response to changes in its environment. Our approach fits into the context of distributed planning for distributed plans, where each agent can produce its own plans. Plans are generated by satisfying constraints using genetic algorithms. Each agent generates a new plan whenever its set of actions to plan changes, so that the new plan takes the newly introduced actions into account. For this new plan, the agent takes as its action set all the unexecuted actions of the old plan together with the new actions engendered by the changes, and as its initial state the state in which its set of actions underwent the change. We use a concrete case to illustrate and demonstrate the utility of our approach.
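The way the replanning input is assembled can be sketched as follows. This is only an illustration under stated assumptions: the function name `replanning_problem`, the string encoding of actions and states, and the pickup/delivery action names are hypothetical, and the genetic algorithm that would then search for a plan is not shown.

```python
from typing import Iterable, List, Set, Tuple

Action = str  # assumed encoding: an action is identified by its name
State = str   # assumed encoding: a state is identified by a label


def replanning_problem(
    old_plan: List[Action],
    executed: Set[Action],
    new_actions: Iterable[Action],
    current_state: State,
) -> Tuple[List[Action], State]:
    """Build the input for a new plan after the agent's action set changes.

    The action set is the old plan's unexecuted actions followed by the new
    actions; the initial state is the state in which the change occurred.
    """
    unexecuted = [action for action in old_plan if action not in executed]
    return unexecuted + list(new_actions), current_state


# Hypothetical pickup-and-delivery example.
old_plan = ["pickup_A", "deliver_A", "pickup_B", "deliver_B"]
actions, initial_state = replanning_problem(
    old_plan,
    executed={"pickup_A"},
    new_actions=["pickup_C", "deliver_C"],
    current_state="at_A",
)
```

The returned pair would then be handed to whatever constraint-satisfying planner the agent uses.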
- North America > United States > California > San Mateo County > San Mateo (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Africa > Middle East > Algeria > Oum el-Bouaghi Province > Oum el Bouaghi (0.04)
Cryptanalysis and improvement of multimodal data encryption by machine-learning-based system
With the rising popularity of the internet and the widespread use of networks and information systems via the cloud and data centers, the privacy and security of individuals and organizations have become crucial. Encryption brings together technologies that can effectively fulfill these requirements by protecting public information exchanges. To achieve these aims, researchers have used a wide assortment of encryption algorithms to accommodate the varied requirements of this field, and have drawn on complex mathematical problems to substantially complicate the encrypted communication mechanism, preserving personal information as much as possible while significantly reducing the possibility of attacks. However complex and distinct the requirements established by these various applications are, attempts to break the algorithms continue to occur, and systems for evaluating and verifying the implemented cryptographic algorithms remain necessary. The best approach to analyzing an encryption algorithm is to identify a practical and efficient technique to break it, or to learn ways to detect and repair its weak aspects; this is known as cryptanalysis. Experts in cryptanalysis have discovered several methods for breaking a cipher, such as exploiting a critical vulnerability in its mathematical equations to derive the secret key, or determining the plaintext from the ciphertext. There are various attacks against secure cryptographic algorithms in the literature, and the widely employed strategies and mathematical solutions enable cryptanalysts to demonstrate their findings, identify weaknesses, and diagnose maintenance failures in algorithms.
- Africa > Middle East > Algeria > Tébessa Province > Tébessa (0.04)
- Africa > Middle East > Algeria > Oum el-Bouaghi Province > Oum el Bouaghi (0.04)
- Asia > Japan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.67)
Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Naggita, Keziah, LaChance, Julienne, Xiang, Alice
Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.
- Asia > Brunei (0.14)
- North America > Canada > Quebec > Montreal (0.06)
- Africa > Sierra Leone (0.06)
- Health & Medicine (0.92)
- Information Technology > Services (0.75)
- Government > Regional Government (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)